Fast cross-validation via sequential testing
Authors
Abstract
With the increasing size of today’s data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses nonparametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating underperforming candidates quickly and keeping promising candidates as long as possible, the method speeds up the computation while preserving the power of the full cross-validation. Theoretical considerations underline the statistical power of our procedure. The experimental evaluation shows that our method reduces the computation time by a factor of up to 120 compared to a full cross-validation with a negligible impact on the accuracy.
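The elimination idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the tolerance rule below is a simple stand-in for the nonparametric test, and the loss counter is a stand-in for the sequential analysis. The names `fast_cv_sketch`, `toy_error`, `tol`, and `max_losses` are hypothetical and chosen only for illustration.

```python
def fast_cv_sketch(candidates, evaluate, n_samples, steps=10, tol=0.02, max_losses=3):
    """Evaluate candidates on linearly growing subsets and drop repeated losers.

    Assumption: `evaluate(candidate, subset_size)` returns a validation error.
    The tolerance check and loss counter are simplified placeholders for the
    statistical tests mentioned in the abstract.
    """
    active = list(candidates)
    losses = {c: 0 for c in candidates}
    errors = {}
    for s in range(1, steps + 1):
        subset_size = s * n_samples // steps  # subset grows linearly with the step
        errors = {c: evaluate(c, subset_size) for c in active}
        best = min(errors.values())
        for c in active:
            # A candidate "loses" a step if it is clearly worse than the best one.
            if errors[c] > best + tol:
                losses[c] += 1
        # Keep promising candidates; drop those that lost too often.
        active = [c for c in active if losses[c] < max_losses]
        if len(active) == 1:
            break  # only one candidate left, so later steps are unnecessary
    return min(active, key=lambda c: errors[c])


def toy_error(param, subset_size):
    """Hypothetical error surface: minimum at param = 0.3, shrinking with more data."""
    return (param - 0.3) ** 2 + 1.0 / subset_size


best = fast_cv_sketch([0.0, 0.1, 0.3, 0.6, 1.0], toy_error, n_samples=1000)
print(best)
```

In this toy run the weaker candidates are dropped after a few small subsets, so most of the remaining steps are never computed; that early stopping is where a time saving of the kind the abstract reports would come from.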
Similar resources
Fast Cross-Validation via Sequential Analysis
With the increasing size of today’s data sets, finding the right parameter configuration via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses non-parametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating underperforming cand...
Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets
Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...
Software-based Testing of Sequential VHDL Descriptions
In this paper, we propose a new high-level test pattern generation technique for sequential circuits. The main motivation is two-fold: on one hand, we elaborate test data for design validation; on the other hand, we deal with the problem of structural test development at functional level. The proposed test method, i.e. mutation testing, allows us to work with a fault model at software level on ...
Logic Design Validation via Simulation and Automatic Test Pattern Generation
We investigate an automated design validation scheme for gate-level combinational and sequential circuits that borrows methods from simulation and test generation for physical faults, and verifies a circuit with respect to a modeled set of design errors. The error models used in prior research are examined and reduced to five types: gate substitution errors (GSEs), gate count errors (GCEs), inp...
Multivariate Analysis of fMRI using Fast Simultaneous Training of Generalized Linear Models (FaSTGLZ)
We present an efficient algorithm for simultaneously training elastic-net-regularized generalized linear models across many related problems, which may arise from bootstrapping, cross-validation and nonparametric permutation testing. Our approach leverages the redundancies across problems to obtain ≈ 10x computational improvements relative to solving the problems sequentially by the standard gl...
Journal: Journal of Machine Learning Research
Volume 16
Publication date: 2015